A Functional Model for Dataspace Management Systems

Authors

  • Cornelia Hedeler
  • Alvaro A. A. Fernandes
  • Khalid Belhajjame
  • Lu Mao
  • Chenjuan Guo
  • Norman W. Paton
  • Suzanne M. Embury
Abstract

Dataspace management systems (DSMSs) hold the promise of pay-as-you-go data integration. We describe a comprehensive model of DSMS functionality using an algebraic style. We begin by characterizing a dataspace life cycle and highlighting opportunities for both automation and user-driven improvement techniques. Building on the observation that many of the techniques developed in model management are of use in data integration contexts as well, we briefly introduce the model management area and explain how previous work on both data integration and model management needs extending if the full dataspace life cycle is to be supported. We show that many model management operators already enable important functionality (e.g., the merging of schemas and the composition of mappings) and formulate these capabilities in an algebraic structure, thereby giving rise to the notion of the core functionality of a DSMS as a many-sorted algebra. Given this view, we show how core tasks in the dataspace life cycle can be enacted by means of algebraic programs. An extended case study illustrates how such algebraic programs capture a challenging, practical scenario.

Affiliation (all authors): University of Manchester
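To make the idea of a many-sorted algebra of model management operators more concrete, the following is a minimal Python sketch. It is not taken from the paper: the sorts `Schema` and `Mapping`, the operator names `match`, `compose`, and `merge`, and the naive case-insensitive name matching are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical sort: a named set of attribute names."""
    name: str
    attributes: frozenset


@dataclass(frozen=True)
class Mapping:
    """Hypothetical sort: attribute correspondences between two schemas."""
    source: str
    target: str
    pairs: frozenset  # set of (source_attribute, target_attribute)


def match(s: Schema, t: Schema) -> Mapping:
    """Schema x Schema -> Mapping. Assumed rule: names equal ignoring case."""
    pairs = frozenset(
        (a, b) for a in s.attributes for b in t.attributes
        if a.lower() == b.lower()
    )
    return Mapping(s.name, t.name, pairs)


def compose(m1: Mapping, m2: Mapping) -> Mapping:
    """Mapping x Mapping -> Mapping, chaining through the shared schema."""
    if m1.target != m2.source:
        raise ValueError("mappings do not share an intermediate schema")
    pairs = frozenset(
        (a, c) for (a, b) in m1.pairs for (b2, c) in m2.pairs if b == b2
    )
    return Mapping(m1.source, m2.target, pairs)


def merge(s: Schema, t: Schema, m: Mapping) -> Schema:
    """Schema x Schema x Mapping -> Schema.

    Keeps every attribute of s and adds those attributes of t that the
    mapping does not already relate to an attribute of s.
    """
    matched_in_t = {b for (_, b) in m.pairs}
    return Schema(f"{s.name}+{t.name}", s.attributes | (t.attributes - matched_in_t))


# A small "algebraic program": match, then compose and merge the results.
a = Schema("A", frozenset({"id", "Name"}))
b = Schema("B", frozenset({"ID", "name", "email"}))
c = Schema("C", frozenset({"Id", "EMAIL"}))

m_ab = match(a, b)
m_bc = match(b, c)
m_ac = compose(m_ab, m_bc)
merged = merge(a, b, m_ab)
```

Because each operator consumes and produces values of the declared sorts, operators can be chained freely, which is the sense in which a dataspace task can be written as a program over the algebra. The operators in the paper itself may differ in both signature and semantics.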

Similar resources

Search Ranking for Heterogeneous Data over Dataspace

Queries in traditional relational database systems work over structured data, whereas information retrieval systems are designed for more versatile and flexible ranked keyword queries over unstructured, semi-structured, streamed, and social networking data, as well as data without any format, collectively known as heterogeneous data. However, several new and emerging applications need data managemen...

Verification and prototyping of distributed dataspace applications

The space calculus is introduced as a language to model distributed dataspace systems, i.e. distributed applications that use a shared (but possibly distributed) dataspace to coordinate. The publish-subscribe and the global dataspace are particular instances of our model. We give the syntax and operational semantics of this language and provide tool support for functional and performance analys...

DSToolkit: An Architecture for Flexible Dataspace Management

The vision of dataspaces is to provide many of the benefits of classical data integration, but with reduced up-front costs. Combining this with opportunities for incremental refinement enables a ‘pay-as-you-go’ approach to data integration, resulting in simplified integrated access to distributed data. It has been speculated that model management could provide the basis for Dataspace Manageme...

Principles and Model for Web Dataspace

An integrated web information management system requires a powerful and versatile data model that can represent a highly heterogeneous mix of data such as web pages, XML, deep web content, and files. It requires access to both structured and unstructured data. Such collections of data have been referred to as a dataspace. In order to build a web dataspace support platform, we described some principl...

An Approach for Designing and Implementing a Visual XML Dataspace System

Dataspace systems constitute a recent data management approach to enabling better cooperation among autonomous and heterogeneous data sources with which the user is initially unfamiliar. A central idea is to gradually increase the user's knowledge about the contents, structures, and semantics of the data sources in the dataspace. Without this knowledge, the user is not able to make sophisticate...

Publication date: 2013